Dataset description

Intro

The Titanic dataset is composed of two parts:

  • Train dataset
  • Test dataset

Number of Measurements

Variable Types

Gender Pie

Class Pie

Survived Pie

Each numerical variable is analysed against the Survived column. Each categorical variable is evaluated by how strongly it relates to the percentage of passengers who survived.
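As an illustration of analysing a numerical variable against the Survived column, here is a minimal sketch that groups a numeric column by the survival flag and summarises each group. The rows are hypothetical toy values, not taken from the dataset, and the helper name `summarise_by_survived` is an assumption made for this example.

```python
from statistics import mean, median

# Hypothetical toy rows (NOT real Titanic records): (survived, age, fare)
rows = [
    (0, 22.0, 7.25),
    (1, 38.0, 71.28),
    (1, 26.0, 7.92),
    (0, 35.0, 8.05),
    (0, 54.0, 51.86),
    (1, 4.0, 16.70),
]

def summarise_by_survived(rows, column):
    """Group one numeric column by the survived flag and summarise each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row[column])
    return {
        flag: {"n": len(vals), "mean": mean(vals), "median": median(vals)}
        for flag, vals in sorted(groups.items())
    }

age_summary = summarise_by_survived(rows, column=1)
fare_summary = summarise_by_survived(rows, column=2)
print(age_summary)
print(fare_summary)
```

Comparing the per-group mean and median gives a first hint of whether a variable separates survivors from non-survivors; a plot of the two distributions would normally follow.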

Univariate View

This section presents only the numeric variables and gives a preliminary evaluation of their impact on survival status.

Gender effect on survival status - Absolute Counts

Gender effect on survival status - Percentage of all passengers
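The two views named above, absolute counts and share of all passengers, can be derived from the same cross-tabulation. The sketch below assumes a list of (sex, survived) pairs; the values are hypothetical toy data, not the real passenger list.

```python
from collections import Counter

# Hypothetical toy passengers (NOT real data): (sex, survived)
passengers = [
    ("female", 1), ("female", 1), ("female", 0),
    ("male", 0), ("male", 0), ("male", 0), ("male", 1), ("male", 0),
]

# Absolute count of passengers in each (sex, survived) cell
counts = Counter(passengers)
total = len(passengers)

# Percentage of ALL passengers that falls in each cell
pct_of_total = {cell: 100.0 * n / total for cell, n in counts.items()}

for (sex, survived), n in sorted(counts.items()):
    pct = pct_of_total[(sex, survived)]
    print(f"{sex:<6} survived={survived}  n={n}  {pct:.1f}% of total")
```

Dividing by the overall total, rather than by the size of each gender group, keeps the four percentages summing to 100, which matches a "percentage of total" view.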

Survived vs Class (ticket)

Survived vs Age

Survived vs Fare

Logistic Regression

In this section a simple classification model is run to identify the variables that most increase the probability of survival.

   Survived Predicted   N
1:        0  Survived 390
2:        1      Died 194
3:        1  Survived  96
4:        0      Died  34
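The four cells above can be tallied into marginal totals. This is a minimal sketch with the counts copied from the table as printed. How the Predicted labels map onto the 0/1 coding of Survived is not stated in the output, so the sketch stops at totals and does not compute an accuracy.

```python
# Counts copied from the table above: (Survived, Predicted, N)
cells = [
    (0, "Survived", 390),
    (1, "Died", 194),
    (1, "Survived", 96),
    (0, "Died", 34),
]

# Number of passengers covered by the table
total = sum(n for _, _, n in cells)

# Marginal totals per actual outcome and per predicted label
by_actual = {}
by_predicted = {}
for survived, predicted, n in cells:
    by_actual[survived] = by_actual.get(survived, 0) + n
    by_predicted[predicted] = by_predicted.get(predicted, 0) + n

print("total:", total)
print("by actual:", by_actual)
print("by predicted:", by_predicted)
```

The marginals show how many passengers the model was evaluated on and how the predictions split between the two labels, which is a useful sanity check before reading the individual cells.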